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Accurate treatment of the long-range electron correlation energy, including van der Waals (vdW) 
or dispersion interactions, is essential for describing the structure, dynamics, and function of a wide 
variety of systems. Among the most accurate models for including dispersion into density functional 
theory (DFT) is the range-separated many-body dispersion (MBD) method [A. Ambrossetti et ai, 

J. Chem. Phys. 140, 18A508 (2014)], in which the correlation energy is modeled at short-range by 
a semi-local density functional and at long-range by a model system of coupled quantum harmonic 
oscillators. In this work, we develop analytical gradients of the MBD energy with respect to nuclear 
coordinates, including all implicit coordinate dependencies arising from the partitioning of the charge 
density into Hirshfeld effective volumes. To demonstrate the efficiency and accuracy of these MBD 
gradients for geometry optimizations of systems with intermolecular and intramolecular interactions, 
we optimized conformers of the benzene dimer and isolated small peptides with aromatic side-chains. 

We find excellent agreement with the wavefunction theory reference geometries of these systems (at 
a fraction of the computational cost) and find that MBD consistently outperforms the popular TS 
and D3(BJ) dispersion corrections. To demonstrate the performance of the MBD model on a larger 
system with supramolecular interactions, we optimized the CgQ@CgQH 2 g buckyball catcher host- 
guest complex. Finally, we find that neglecting the implicit nuclear coordinate dependence arising 
from the charge density partitioning, as has been done in prior numerical treatments, leads to an 
unacceptable error in the MBD forces, with relative errors of ~ 20% (on average) that can extend 
well beyond 100%. 


I. INTRODUCTION 

A theoretically sound description of noncovalent inter¬ 
actions, such as hydrogen bonding and van der Waals 
(vdW) or dispersion forces, is often crucial for an accu¬ 
rate and reliable prediction of the structure, stability, and 
function of many molecular and condensed-phase sys¬ 
tems Dispersion interactions are inherently quantum 
mechanical in nature since they originate from collec¬ 
tive non-local electron correlations. Consequently, they 
pose a significant challenge for electronic structure the¬ 
ory and often require sophisticated wavefunction-based 
quantum chemistry methodologies for a quantitatively 
(and in some cases qualitatively) correct treatment. Over 
the past decade, this challenge has been addressed by 
a number of approaches seeking to approximately ac¬ 
count for dispersion interactions within the hierarchy of 
exchange-correlation functional approximations in Kohn- 
Sham density functional theory (DFT),which is ar¬ 
guably the most successful electronic structure method 
in widespread use to day throughout chemistry, physics, 
and materials science.!^ 

Based on a summation over generalized interatomic 
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London (Cq/R^) dispersion contributions, the class of 
pairwise-additive dispersion methods provide a simple 
and computationally efficient avenue for approximately 
incorporating these ubiquitous long-range interactions 
within the framework of DFT. (See Ref. [5^ for a re¬ 
cent and comprehensive review of dispersion methods in 
DFT.) Although these pairwise-additive methods are ca¬ 
pable of reliably describing the dispersion interactions 
in many molecular systems, it is now well known that 
both quantitative and qualitative failures can occur, as 
demonstrated recently in the binding energetics of host- 
guest complexes}^ conformational energetics in polypep¬ 
tide a-helices,l^ cohesive properties in molecular crys- 
taisIMMl relative stabilities of (bio)-molecular crystal 
polymorphs and interlayer interaction strengths in 

layered materialsj A l ^s i to name a few. 

In each of these cases, the true many-body nature 
of dispersion interactions becomes important, whether 
it is due to beyond-pairwise contributions to the dis¬ 
persion energy, such as the well-known three-body 
Axilrod-Teller-Muto (A TM) ter rn,!^ electrodynamic re¬ 
sponse screening effects or the non-additivity of 

the dynamic polarizability.l^ One of the most success¬ 
ful models for incorporating these many-body effects 
into DFT is the many-bod y dispersion (MBD) model of 
Tkatchenko et a; D6 | 47 | 52 l53] approximates the long- 

range correlation energy via the zero-point energy of a 
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model system of quantum harmonic oscillators (QHOs) 
coupled to one another in the dipole approximation. 
The correlation energy derived from diagonalizing the 
corresponding Hamiltonian of these QHOs is provably 
equivalent to the random-phase approximation (RPA) 
correlation energy (through t he ad iabatic-connection 
fluctuation-dissipation theorem) .^21111 xhe MBD model 
has consistently provided improved qualitative and quan¬ 
titative agreement with both e xperi mental results and 
wavefunction-based benchmarks .SSE3 Notably, MBD cor¬ 
rectly predicts the experimentally known relative stabili¬ 
ties of the molecular crystal polymorphs of glycind^ and 
aspirin,!^ which pairwise methods fail to do. Refs. EH 
and [68| offer recent perspectives on the role of non¬ 
additive dispersion effects in molecular materials and the 
key successes of the MBD model. 

In this work, we seek to extend the applicability of the 
MBD model by deriving and implementing the analytical 
gradients of the range-separated many-body dispersion 
(MBD@rsSCS) energy with respect to nuclear coordi¬ 
nates, thereby enabling efficient geometry optimizations 
and molecular dynamics simulations at the DFT+MBD 
level of theory. This paper is principally divided into a 
theoretical derivation of the analytical forces in the MBD 
model (Sec. 0- and a discussion of the first applications 
of these analytical MBD forces to the optimization of 
isolated molecular systems (Sec. IV). In Secs. H Apl B 


we start by presenting a self-contained summary of the 
MBD framework to clarify notation and highlight the 
different dependencies of the MBD energy on the nu¬ 
clear coordinates. We then derive analytical nuclear gra¬ 
dients of the MBD@rsSCS correlation energy (Sec. HCl. 
In Sec. [ITT] and Sec |VH J] of the accompanying Electronic 
Supplementary Information (ESI)P^, we give computa¬ 
tional details. Subsequently, we demonstrate the impor¬ 
tance of MBD forces for several representative systems 
encompassing inter-, intra-, and supra-molecular interac¬ 
tions (Secs. [rVA]|rVCt . We finally examine the role of 
the implicit nuclear coordinate dependence that arises 
from the partitioning of the electron density into effec¬ 
tive atomic volumes (Sec. IV DI and conclude with some 
final remarks on potential avenues for future work. 


II. THEORY 

A. Notation employed in this work. 

As the theory comprising the MBD model has evolved 
over the past few years, several notational changes have 
been required to accommodate the development of a more 
complete formalism that accounts for the various contri¬ 
butions to the long-range correlation energy in molecu¬ 
lar systems and condensed-phase materials. In this sec¬ 
tion, we provide a current and self-contained review of 
the MBD@rsSCS model followed by a detailed deriva¬ 
tion of the corresponding analytical nuclear gradients 
(forces). Our discussion most closely follows the notation 


employed in Refs. I52I53I To assist in the interpretation 
of these equations, we have also furnished a glossary of 
symbols utilized in this work as part of the ESI.I^ For 
a more thorough discussion of the MBD model (includ¬ 
ing its approximations and physical i nterpr etations), we 
refer the reader to the original works^SES gg g 

recent revievi^ on many-body dispersion interactions in 
molecules and condensed matter. 

Throughout this manuscript, all equations are given 
in Hartree atomic units (Ti = me = e = 1) with tensor 
(vector and matrix) quantities denoted by bold typeface. 
In this regard, one particularly important bold/normal 
typeface distinction that will arise below is the difference 
between the 3x3 dipole polarizability tensor. 


( 1 ) 




and the “isotropized” 
tity obtained via 


dipole polarizability, a scalar quan- 


a= |Tr[Q:] 


( 2 ) 


The Cartesian components of tensor quantities are in¬ 
dicated by superscript Latin indices ij, i.e., T*-^ is the 
(j,j)*^ component of the tensor T. Likewise, Cartesian 
unit vectors are indicated by {ej,ej}. Atom (or QHO) 
indices are denoted by subscript Latin indices abc. The 
index p will be used as a dummy index for summation. 
The imaginary unit is indicated with blackboard bold 
typeface, i, to distinguish it from the Cartesian compo¬ 
nent index i. Quantities that arise from the solution of 
the range-separated self-consistent screening (rsSCS) sys¬ 
tem of equations introduced by Ambrosetti et will 
be denoted by an overline, i.e., X —>■ X. For brevity we 
will refer to the MBD@rsSCS model (which has also been 
denoted as MBD* elsewhere) as simply MBD throughout 
the manuscript. 

The MBD model requires keeping track of several dif¬ 
ferent quantities that are naturally denoted with variants 
of the letter “R”, so we highlight these quantities here for 
the benefit of the reader. Spatial position, such as the 
argument of the electron density, p{r), is indicated by r. 
The nuclear position of an atom a (or QHO mapped to 
that atom) is indicated by Jla- The internuclear vector is 
denoted Raf, = !Ra — such that the internuclear dis¬ 
tance is given by Rat = IIR-afcH- R follows that the 
Cartesian component of this internuclear vector is 
Finally, the effective vdW radius of an atom a is indicated 
by 

The dependence of the long-range MBD correlation en¬ 
ergy, Ambd, on the underlying nuclear positions, {31} = 
IRq, Jlh, Ole, ■ ■ ■, will arise both explicitly through the pres¬ 
ence of internuclear distance terms, Rabi £oid implicitly 
through the presence of effective atomic volume terms, 
Va = 14 [{31}], obtained via the Hirshfeld partitioning^ 
of p{r) (see Sec. HBll. As such, these distinct types of 
dependence on the nuclear positions will be clearly de¬ 
lineated throughout the review of the MBD model and 
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the derivation of the corresponding MBD nuclear forces 
below. For notational convenience, we will often use dc 
rather than Vto indicate a derivative with respect to 
the nuclear position of atom c. 


B. Review of the many-body dispersion (MBD) 
model. 


The MBD formalism is based on a one-to-one mapping 
of the N atoms comprising a molecular system of interest 
to a collection of N QHOs centered at the nuclear coordi¬ 
nates, each of which is characterized by a bare isotropic 
frequency-dependent dipole polarizability, aa(ioj). De¬ 
rived from the electron density, i.e., Ua = aa[p(r)], 
these polarizabilities describe the unique local chemi¬ 
cal environment surrounding a given atom by account¬ 
ing for hybridization (coordination number), Pauli re¬ 
pulsion, and other non-trivial exchange-correlation ef¬ 
fects (see Sec. IIB 1[ ). To account for anisotropy in the 
local chemical environment as well as collective polar¬ 
ization/depolarization effects, the solution of a range- 
separated Dyson-like self-consistent screening (rsSCS) 
equation is used to generate screened isotropic frequency- 
dependent dipole polarizabilities for each QHO, IXa (see 
Sec. IIB2I. The MBD model Hamiltonian is then con¬ 


structed based on these screened frequency-dependent 
dipole polarizabilities. Diagonalization of this Hamilto¬ 
nian couples this collection of QHOs within the dipole 
approximation, yielding a set of interacting QHO eigen- 
modes with corresponding eigenfrequencies {A}. The 
difference between the zero-point energy of these in¬ 
teracting QHO eigenmodes and that of the input non¬ 
interacting modes ({w}), is then used to compute the 
long-ran ge corr elation energy at the MBD level of theory 
(see Sec. IIB31, i.e., 


3N 


i^MBD - n 


^ V- 


( 3 ) 


p=l 


a—1 


1. The MBD starting point: bare dipole polarizabilities. 


Mapping the N atoms comprising a molecular system 
of interest onto a collection of N QHOs is accomplished 
via a Hirshfeld partitioning^ of p{r), the ground state 
electron density. Partitioning p(r) into N spherical ef¬ 
fective atoms enables assignment of the bare frequency- 
dependent dipole polarizabilities Q:£,(iw) used to charac¬ 
terize a given QHO. Within the MBD formalism, this 
assignment is given by the following 0/2-order Pade ap- 
proximant applied to the scalar dipole polarizabilitiesl^ 


aa(iw) 


Qa(0) 

1 - {iuj/uJaf 


( 4 ) 


in which aa(0) is the static dipole polarizability and to a 
is the characteristic excitation (resonant) frequency for 


atom a. The dependence of the bare frequency-dependent 
dipole polarizability in Eq. Q on p(r) is introduced by 
considering the direct proportionality between polariz¬ 
ability and atomic volume,^ an approach that has been 
very successful in the Tkatchenko-Schefher (TS) disper¬ 
sion correction,^ i.e., 


aa[p(r)](0) 



/ / dr Wa{r)p{r)r^ \ 
V /drpfr«®(r)r3 ) 


a. 


free 


( 0 ), 


( 5 ) 

( 6 ) 


in which I//’’®® and a^®® are the volume and static dipole 
polarizability of the free (isolated) atom in vacuo, re¬ 
spectively, obtained from either experiment or high-level 
quantum mechanical calculations. Explicit dependence 
on p(r) resides in the effective “atom-in-a-molecule” vol¬ 
ume, Va[p{r)\, obtained via Hirshfeld partitioning^ of 
p(r) into atomic components, in which the weight func¬ 
tions. 


u;,(r)=pl'-®®(r)/y]pr‘^®(r), (7) 

b 


are constructed from the set of spherical free atom den¬ 
sities, {yO^‘'®®(r)}. At present, we compute the Hirsh¬ 
feld partitioning and subsequently the MBD energy and 
forces as an a posteriori update to the solution of the 
non-linear Kohn-Sham equations, i.e. without perform¬ 
ing self-consistent updates to p{r). Future work will ad¬ 
dress the impacts of computing the Hirshfeld partition¬ 
ing iterative!}^ and using the MBD potential to update 
the Kohn-Sham density self-consistently. In this regard, 
recent work on the self-consistent application of the TS 
method indicates that self-consistency can have a sur¬ 
prisingly large impact on the charge densities, and cor¬ 
responding work functions, of metallic surfaces,!^ so we 
anticipate that self-consistent MBD will be particularly 
interesting for the study of surfaces and polarizable low¬ 
dimensional systems. 

For later convenience, we rewrite Eqs. (|^ and ([^ to 
collect all quantities that do not implicitly depend on the 
nuclear coordinates through Vffpijc)] into the quantity 
Ta(ia;): 


aa[p(r)](ia;) = 


a 


.free 


( 0 ) 


= Taiiuj) Va[p{v)]. 


VaW)] ( 8 ) 

( 9 ) 


2. Range-separated self-consistent screening (rsSCS). 

Let A be a ZN x 3iV block diagonal matrix formed from 
the frequency-dependent polarizabilities in Eq. 

N 

A(ia;) = ^ = diag[ai, ck 2 , • ■ •, ckat]. (10) 

6=1 
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This quantity will be referred to as the hare system 
dipole polarizability tensor. For a given frequency, range- 
separated self-consistent screening (rsSCS) of A(itLi) is 
the n acco mplished by solving the following matrix equa- 
tioiJMIlll (gee the ESP^ for the detailed derivation of 
Eq. @): 


A = A - ATsr a 


^A= A 


-SR 


( 11 ) 

( 12 ) 


where TgR is the short-range dipole-dipole interaction 
tensor, defined below in Sec. IIB 4 Eq. (36l. The matrix 


A is the (dense) screened non-local polarizability matrix, 
sometimes called the relay matrix.ES 

Partial internal contraction over atomic sub-blocks of 
A yields the screened and anisotropic atomic polarizabil¬ 
ity tensors (the corresponding molecular polarizability is 
obtained by total internal contraction), he.. 


N 


aa(iw) = Aab(ia;). 


(13) 


6=1 


The static “isotropized” screened polarizability scalars, 
Q'a(O), that appear in the MBD Hamiltonian in Eq. (18) 


and Sec. IIB 3 below are then calculated from Q:a(0) via 


3. The MBD model Hamiltonian. 


The central concept in the MBD model is the Hamilto¬ 
nian for a set of coupled QHOs that each fluctuate within 
an isotropic harmonic potential U{xa) = and 

acquire instantaneous dipole moments, = Qa^a, that 
are proportional to the displacement, x^, from the equi¬ 
librium position and charge, on each oscillator. This 
Hamiltonian defines the so-called coupled fluctuating 
dipole model (CFDM),l^and is given by: 


N 


N 


dlTa6d6 


^ 1 V2 

'HcFDM = “ X/ 9 —^ 

^ 2 rna ^ 

(17) 

where T^b is the dipole-dipole interaction tensor that 
couples dipoles a and b. 

In the range-separated MBD model,!^ T is replaced 
by a long-range screened interaction tensor, T lr (as 
defined in Sec. HB4 and Eq. 


below), and the 
fluctuating point dipoles are replaced with the Gaus¬ 
sian charge densities of QHOs, with effective masses 
vria = (aa(0) obtained from their respective static 

polarizabilities and excitation frequencies. The corre¬ 
sponding range-separated MBD model Hamiltonian is 
therefore!^ 


Q;a(0) = -Tr 


aa(0) 


(14) 


as described above in Eq. (|^. Note that Eqs. ( [I2p^ can 
be solved at any imaginary frequency, iw, so we do not 
require the Fade approximant given in Eq. (|^ to boot¬ 
strap from aa(0) to 0 : 0 ( 110 ;). However, the relationship 
between uJa and Ce^aa, given in Eq. (16), is one that is 
derived from the Fade approximant for the bare polariz¬ 
ability 0 ( 110 ;). 

In the non-retarded regime, the Casimir-Folder inte¬ 
gral relates the effective Cg^ab dispersion coefficient to the 
dipole polarizabilities of QHOs a and b via the following 
integral over imaginary frequencies!^ 


Gg oh = — 
TT 


do; Oa(io;)ot,(iio;). 


(15) 


By solving Eqs. ( I2p^ on a grid of imaginary frequencies 
{ij/p}, a set of screened effective Gg coefficients, {Gg}, can 
be determined by a Gauss-Legendre quadrature estimate 
of the integral in Eq. (15). The screened QHO charac¬ 


teristic excitation frequency, uJa-, is then calculated as 


_ 4 G, 

Wa = - 


6,aa 


3 [Oa(0)]2 TT 




«a(iyp) 

ao(0) 


n 2 


(16) 


where gp and yp are the quadrature weights and abscis¬ 
sae, respectively. Scaling of the usual Gauss-Legendre 
abscissae from [—1,1] to the semi-infinite interval [0,(x;) 
is discussed in the accompanying ESI.I^ 


Hmbd = - + yZ 

a—1 a—1 

N 


a>b 


in which fia = y/'^a IS the mass-weighted dipole 
moment!^ of QHO a that has been displaced by from 


its equilibrium position. The first two terms in Eq. (18) 


represent the kinetic and potential energy of the individ¬ 
ual QHOs, respectively, and the third term is the two- 
body coupling due to the long-range dipole-dipole inter- 


r LR 


action tensor, , defined below in Eq. (38). 

By considering the single-particle potential energy and 
dipole-dipole interaction terms in Eq. (18), we can con¬ 


struct the 3N X 3N MBD interaction matrix, which is 
comprised of 3 x 3 subblocks describing the coupling of 
each pair of QHOs a and b: 


iMBD 


: LR 


-'ab — 4” (^ aV^b \/O:a(0)o:b(0) ab ; 


(19) 


where Sab is the Kronecker delta between atomic indices. 

The eigenvalues {Ap} obtained by diagonalizing 
correspond to the interacting (or “dressed”) QHO modes, 
while uJa correspond to the modes of the non-interacting 
reference system of screened oscillators. The MBD cor¬ 
relation energy is then evaluated via Eq. ^ as the zero- 
point energetic difference between the interacting and 
non-interacting modes. 

For periodic systems, all instances of the dipole-dipole 
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interaction tensor would be replaced by 

Tab Tab + Tab' (20) 

b' 


where the sum over b' indicates a lattice sum over the 
periodic images of atom b. Since this is an additive 
modification of T, it will not qualitatively modify the 
expressions for the analytical nuclear derivatives of the 
MBD energy. Hence, the derivation of the nuclear forces 
presented herein (and the accompanying chemical ap¬ 
plications) will focus on non-periodic (or isolated) sys¬ 
tems. We note in passing that the current implementa¬ 
tion of the MBD energy and nuclear forces in Quantum 
ESPRESSO (QE)I^ is able to treat both periodic and 
non-periodic systems. In this regard, a forthcoming pa- 
peiES will describe the details of the implementation and 
discuss the subtleties required to make the computation 
of well-converged MBD nuclear forces efficient for peri¬ 
odic systems. 


4- The range-separated dipole-dipole interaction. 

Prior to range-separation, the 3x3 sub-block Tab of 
the dipole-dipole interaction tensor T, which describes 
the coupling between QHOs a and b, is defined as: 


where aa(iw) = |Tr[Q;J is the “isotropized” bare dipole 


polarizability and Eq. (|9| was used to make the effective 
volume dependence more explicit. 

The Cartesian components of the dipole-dipole inter¬ 
action tensor in Eq. (21) (with all QHO indices and 
frequency-dependence of C suppressed) are given by: 


T*^(iw) = 


erf [C] - ^ exp 


rp I'J 

-L Ai 


dip 


(27) 


4 

t/tt R^ 


C"exp[-C2], 


where i?® = Rab • is the Cartesian component of 
Rab, and Tjip is the frequency-independent interaction 
between two point dipoles: 


r dip 


-3R®RJ -k R^S,j 
R5 


(28) 


with Sij indicating the Kronecker delta between Carte¬ 
sian indices. 

The range-separation of the dipole-dipole interaction 
tensor is accomplished by using a Fermi-type damping 

function,EH™l 


/(^ab) 


1 -k exp [-Zab] 


-1 


(29) 


Tab = 0 Vat^^Uab, (21) 

where Vab is the frequency-dependent Coulomb inter¬ 
action between two spherical Gaussian charge distribu¬ 
tions.^ This frequency-dependent interaction arises due 
to the fact that the ground state of a QHO has a Gaus¬ 
sian charge density: 


tu ■ \ erf [Cab (iw)] 

Tab(Rab,lw) = 

■^ab 

(22) 

where Rab = H^la - 3^b||, 


= ^ab 

(23) 

and 


Eab(iw) = \/cra(iw)2 -k tTb(iw)2 

(24) 


is the effective correlation length of the interaction po¬ 
tential defined by the widths of the QHO Gaussians (see 
Eq. ( 25 ), below). As such, the dependence of T on both 
the frequency and (implicitly) on the nuclear coordinates 
originates from Eab(iw) (see also Eqs. (§-(§). 

In terms of the bare dipole polarizability, the width of 
the QHO ground-state Gaussian charge density is given 
by: 


which depends on Zab, the ratio between Rab, the inter- 
nuclear distance, and Sab, the scaled sum of the effective 
vdW radii of atoms a and b, TZ'^a^ and 


Zab = 6 


Rab 

^b 


- 1 


^ab = P [n 


vdW 




vdWl 


(30) 

(31) 


Here, the range-separation parameter /3 is fit once for a 
given exchange-correlation functional by minimizing the 
energy deviations with respect to highly accurate refer¬ 
ence data.l^ The short- and long-range components of 
the dipole-dipole interaction tensor in Eq. (271 are then 
separated according to: 


Tsr = [1 - fiZ)] T 


(32) 


and 


Tlr = fiZ)T. 


(33) 


However, at long-range, the frequency-dependence in T 
dies off quickly, so when evaluating the MBD Hamilto¬ 
nian we replace Eq. (33) with the approximation 


Tlr - /(^)Tdi, 


(34) 


(Ta(iw) 





(25) 

(26) 


which is equivalent to taking erf [Q ~ 1 and exp[—~ 
0 in Eq. (271 and (331. This has the added benefit of 


improved computational efficiency since special functions 
such as the error function and exponential are relatively 
costly to compute. As shown in Fig. I^in the ESI,^ these 
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approximations are exact to within machine precision for 
(^ > 6, and thus in practice by the time f{Z) has obtained 
a substantial value, the frequency dependence in T has 
vanished, thereby justifying Eq. ([34|. 

The rsSCS procedure described in Sec. |II B 2| adds a 
further subtlety in that it modifies the effective vdW radii 
in the definition of the Sab and Zab quantities above (see 
Refs. I46I52I for a more detailed discussion of these def¬ 
initions). For the short-range interaction tensor (he., 
the tensor used in the rsSCS procedure) the damping 
function utilizes effective vdW radii calculated at the 
Tkatchenko-Schefher (TS) levelP^ 

.^vdW.TS _ ( 

a “ i ^free 


1/3 




vdW, free 


(35) 


where free is the free-atom vdW radius defined in 

Ref.|29]using an electron density contour, not the BondP^ 
radius that corresponds to the “atom-in-a-molecule” ana¬ 
log of this quantity. To indicate that the TS-level ef¬ 
fective vdW radii are being used, the argument of the 
damping fun ction for the short-range interaction tensor, 
used in Eqs. (|lll|l2 ), will be denoted with Z"^^ (cf. Eqs. 
@|3^): 


Tsr 


1 - / (Z^S) 


T. 


(36) 


C. Derivation of the MBD nuclear forces. 

With the above definitions in hand, we are now ready 
to proceed with the derivation of the analytical deriva¬ 
tives of the MBD correlation energy with respect to the 
nuclear (or nuclear) position 5lc of an arbitrary atom c. 
These MBD forces are added to the DFT-based forces. 
As mentioned above in Sec. |II A[ two distinct types of 
nuclear coordinate dependence will arise: explicit depen¬ 
dence through Rat, = Jla — Olb and implicit dependence 
through E[{1R}] (as moving a neighboring atom c will 
slightly alter the effective volume assigned to atom a). 
Future work will address the effects of the MBD con¬ 
tribution to the exchange-correlation potential when ap¬ 
plied self-consistently, which will ultimately impact p{r). 
Our current work neglects these effects, and computes 
MBD as an a posteriori correction to DFT, i.e., non- 
self-consistently. 

Having carefully separated out the implicit dependence 
on F[{31}] in the relevant quantities above, the derivation 
proceeds largely by brute force application of the chain 
and product rules. The derivative of the MBD correlation 
energy given in Eq. is governed by: 

1 3Ar g AT 

dcEymio = 2 ^ dc^/X^ — - ^ dctOa, (39) 

p—1 a—1 


For the long-range dipole-dipole interaction tensor used 
in the MBD Hamiltonian in Eq. (181, the damping func¬ 
tion utilizes the self-consistently screened effective vdW 
radiiP 


:^vdw ^ / a. 

/ V ^ - 


■free 


(o^\ 


1/3 


(0); 


7^ 


vdW, free 


(37) 


wherein the ratio a(0)/a^'^®®(0) takes the place of F/Rf''®® 
thereby still exploitin g the proportionality between po¬ 
larizability and volume .^S^mixo indicate that the screened 
effective vdW radii are being used, the argument of the 
damping function for the long-ra nge intera ction tensor 
will be denoted with Z (cf. Eqs. (37 30 ]|m l): 


Tlr = /(Z)T, 


dip • 


(38) 


This dependence on Z is why we use an overline on Tlr 
above, and in Eqs. ( fl^ . 


hence requiring derivatives of the screened excitation fre¬ 
quencies, uJa, as well as the eigenvalues, Ap, of the 
matrix. Since is real and symmetric, it has 3N 

orthogonal eigenvectors. We therefore do not concern 
ourselves here with repeated eigenvalues (see the ESP^ 
for a more detailed discussion) and take derivatives of Ap 
asP 


^c\l Ap — 

^cXp — 


dcXp 


pp 


N 


P =1 


Tr 


x^ X 


(40) 

(41) 

(42) 


where X is the matrix of eigenvectors of and 

A = diag[Ap] is the diagonal matrix of eigenvalues. To 
evaluate this last line we require the derivative of the ab 
block of (cf. Eq. ^), 


drC 


MBD 

ab 


— (1 ^ab) [^a^c^b ^b^c^a] '\/Q;a(0)Q^b(0) ab 


LR 


(43) 


, ,_ [Oa (0) 0cah(O) -f Of, (0) 0cQ!a (0)] =LR 

+ — Oab) - , . . - J-^ 


2^/aJfi)abW) 


ab 


(1 - Sab) UJaUJbV^a{0)ab{0) OcT 


7 ^ LR 


ab 


To proceed any further we now need the derivatives of oj, 
O', and Tlr. From Eq. (161, we find that the derivative 


of the screened excitation frequency, ui, requires us to 
evaluate derivatives of □-(iia;) (with Q'(O) as a specific case) 
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as follows: 

dr_UJn 




9p 




aa(iyp)dcaa{iyp) 

[«a(0)]2 

[aaiiyp)f dcaaiO) 
[«a(0)]3 


(44) 


The derivative of the screened polarizability, a, Eq. (141 


is calculated from the “isotropized” partial contraction of 
A (with the frequency dependence suppressed): 


dc^a = 


N 




ab 


Lfe=l 


(45) 


Using Eq. (|12|) and (|36|) and expanding the derivative of 

(46) 


the inverse of a non-singular matrix, we have 
deA = -A [-A-1 [a^A] A-1 + ScTsh] A. 
Using Eqs. and ( [Tq| , we compute dcA as: 


N 


aeA = 0diag[T, d,Va]- 


(47) 


a=l 


In Eq. (471 we have terminated the chain-rule with dcVa, 


which has remaining implicit dependence on the nuclear 
coordinates. We regard dcVa as one of our three fun¬ 
damental derivatives since the Hirshfeld partitioning is 
typically computed separately from the rest of the MBD 
algorithm. Discussion of how to compute dcVa may be 
found in the ESI.I^ 

In considering the derivatives of the dipole-dipole in¬ 
teraction tensors, we will encounter both implicit and ex¬ 
plicit nuclear position dependence through ^af,, Eq. (23l. 


The derivatives of Tsr, Eq. (361, and Tlr, Eq. (381, are 
fairly complicated, so it will help to consider first the 
damping function, /, in isolation. Here, 


a 1- exp[-Za6] 

dcf[Kab) - -- - —■■■-2 

[1 -h exp [-Zab\\ 


dcZ, 


ab: 


dr.Zah = 6 


dc^ab ^abdc^ab 
Sab 


Sib 


daSab = P [dJVf^ + dalVt^] 
where dcRab is calculated as 


dcRab = VaiJIRabll = [5ac — 5bc) 


IR 


■ab 


(48) 

(49) 

(50) 

(51) 


and the effective vdW radii have only implicit nuclear co¬ 
ordinate dependence. For the gradient of Tsr, Eq. (361, 


we require the derivative of the TS-level effective vdW 
radii, Eq. (351: 


Q^vdW,TS^ 


^vdW.free 




11/3 


3[K] 


2/3 ’ 


(52) 


while for the gradient of Tlr, Eq. (38 1, we require the 

rad: 

dc^aifi) 


derivative of the screened effective vdW radii, Eq. (371: 

■^vdW, free 


drTl „ = 


4-(0)]'/^3[a,(0)]"/^’ 


(53) 


which was evaluated using Eqs. (|45[)-(47 1. 


In the following we suppress the a, 6, c QHO indices 
where possible so that the Cartesian indices i, j are high¬ 
lighted. First we consider the derivative of T^jp, Eq. (281, 
which is given by: 


9T*fp = -3 




R3dR^ + R^dRJ 5ffi?i 


dR , 

(54) 


R^ 

where dR^ is evaluated as: 

dcRlb = Vai. ((X - IRb) • eO = {5ac - Sbc)e^. (55) 


Since the long-range dipole-dipole interaction tensor 
is approximated with the frequency-independent Tdip 
(thereby eliminating (), Eqs. (48)-(53l and (54) provide 
us with all of the quantities needed to evaluate ^cTlr 
as: 

dJrX LR = T:i dip dj (Zab) + f (Zab) daT^ dip- 

(56) 

The derivative of Tsr is more complex since T depends 
on C: 

dcKl SR = dJ {Zl^) + [1 - / (ZJS)] daTZ, 

(57) 

in which the derivative of jg given below (see the 
Espni for a detailed derivation): 


dTd = -3 


erf [C] - 


MO 
2C J 




(58) 


CMO 


— Ear 
3 'I'P R^ 


r R^R^ 

r 1 ■ 

dip ^5 

3- 2M 


MO 90 


wherein we have defined the following function for com¬ 
pactness. 


hi(ab) = ^exp[-Cp. 


(59) 


The derivative of (ab is given by (with QHO indices re¬ 
stored to express dcZiab from Eq. (11): 


a /■ Cab n Cab + CTbdc(Jb\ 

^c^ab — 23 ^c^ab t^2 ' 

^ab 


(60) 


^ab 


where dcCTa is computed from Eq. (261 as 

-I 1/3 

^ I ‘Z 

dcCTa = 


MM- 


dcVa 


3[K] 


2/3 ■ 


(61) 


























































We have now reduced the analytical nuclear deriva¬ 
tive of the MBD correlation energy to quantities that 
depend on three fundamental derivatives: d^Rab^ ^cRab 
and dcVa- The expressions for dcRab and dcR\i, have 
been given above in Eqs. and ( |5^ , and are straight¬ 
forward to implement. The computation of dcVa is out¬ 
lined briefly in the ESI.^ 


III. COMPUTATIONAL DETAILS 


ular interactions (buckyball catcher host-guest complex). 
We subsequently examined the importance of the implicit 
nuclear coordinate dependence that arises from the Hir- 
shfeld effective volume gradient dV in the computation 
of the MBD forces. 


A. Intermolecular interactions: stationary points 
on the benzene dimer potential energy surface. 


We have implemented the MBD energy and analyti¬ 
cal nuclear gradients (forces) in a development version 
of Quantum ESPRESSO v5.1 (QE).I^A forthcoming 
publication will discuss the details of this implementa¬ 
tion, including the parallelization and algorithmic strate¬ 
gies required to make the method efficient for treating 
large-scale condensed-phase systems .1^ 

All calculations were performed with the Perdew, 
Burke, and Ernzerhof (PBE) exchange-correlation 
functional,!^ and Hamann-Schlueter-Chiang-Vanderbilt 
(HSCV) norm-conserving pseudopotentials.!^ As a point 
of completeness, it should be noted that in QE the Hir- 
shfeld partitioning has only been implemented for norm- 
conserving pseudopotentials, and thus the MBD method 
cannot presently be used with ultrasoft pseudopotentials 
or projector-augmented wave methods. To ensure a fair 
comparison with our implementation of the MBD model, 
all TS calculations were performed as a posteriori cor¬ 
rections to the solution of the non-linear Kohn-Sham 
equations, i.e. we turned off the self-consistent density 
updates from TS. Additional computational details, in¬ 
cluding detai led co nvergence tolerances and basis sets are 
given in Sec. VIIJ of the ESI.For comparis on wi th the 
D3(BJ) dispersion correction of Grimme et (here¬ 

after abbreviated as D3) we also optimized structures 
using Orca v3.03.!^ We used the atom-pairwise version 
of D3(BJ) since only numerical gradients were available 
for the three-body term. 


IV. RESULTS AND DISCUSSION 

To verify our implementation of the MBD energy in 
QE, we compared against the impleme ntatio n of the 
MBD@rsSCS model in the FHI-AIMS codd^HH] and And 
agreement to within 10“^^ Ej,. We next verified our im¬ 
plementation of the analytical gradients by computing 
numerical derivatives via the central difference formula 
and And agreement within the level of expected error 
given the finite spacing between the grid points describ¬ 
ing p(r) and error propagation of finite differences of the 
Hirshfeld effective volume derivatives. 

To demonstrate the efficiency and accuracy of the an¬ 
alytical MBD nuclear gradient, we performed geometry 
optimizations on representative systems for intermolecu¬ 
lar interactions (benzene dimer), intramolecular interac¬ 
tions (polypeptide secondary structure), and supramolec- 


As the prototypical example of the tt — tt interaction, 
there have been a large number of theoretical studies 
on the benzene di mer u sing very high-level wavefunction 
theory methods.fSShllil gj^ce the intermolecular attrac¬ 
tion between the benzene dimer arises primarily from a 
balance between dispersion interactions and quadrupole- 
quadrupole interactions (depending on the intermolecu¬ 
lar binding motif), the interaction energy is quite small 
(^2 — 3 kcal/mol) and the potential energy surface 
(PES) is very flat. Consequently, resolving the stationary 
points of this PES is quite challenging for both theory 
and experiment. The prediction of the interaction en¬ 
ergy in the benzene dimer represents a stringent test of 
the ability of a given electronic structure theory method 
to capture and accurately describe non-bonded inter¬ 
molecular interactions. Historically, three conformers of 
the dimer have received the most attention, namely the 
“sandwich,” “parallel-displaced,” and “T-shaped” struc¬ 
tures. Using the high-level benchmark interaction energy 
calculations as a guide, several studies have used a vari¬ 
ety of more approximate m ethods to examine the PES 
more broadly.Q^lIinSIIlIIIill Ry scanning the PES of the 
benzene dimer with DFT-based symmetry ada pted per¬ 
turbation theory (DFT-SAPT), Podeszwa et iden¬ 
tified 10 stationary points, i.e., either minima (M) or 
saddle points (S) of the interaction energy (see Fig. [^. 
Most wavefunction studies of the benzene dimer PES 
have used a fixed monomer geometry, assuming that the 
weak interactions will produce very little relaxation of 
the rigid monomer .ESIl Using the highly accurate fixed 
benzene mon ome r geometry of Gauss and Stanton,!^ 
Bludsky et performed counterpoise-corrected ge¬ 

ometry optimizations of these 10 configurations at the 
PBE/GGSD(T) level of theory, with an aug-cc-pVDZ ba¬ 
sis set. The resulting geometries are among the largest 
molecular dimers to be optimized with a GGSD(T) cor¬ 
rection to date and represent the most accurate available 
structures for the dimer of this classic aromatic system. 

As a first application of the MBD analytical nuclear 
gradients derived and implemented in this work, we 
performed geometry optimizations on these 10 benzene 
dimer configurations at the PBE+MBD, PBE+TS, and 
PBE+D3 levels of theory. All of the geometry opti¬ 
mizations performed herein minimized the force com¬ 
ponents on all atomic degrees of freedom according 
to t he thr esholds and convergence criteria specified in 
Sec. [Vnj] of the ESpl {i.e., frozen benzene monomers 
were not employed in these geometry optimizations). 
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FIG. 1. Top: Graphical depictions of the lOcoiifigurations that correspond to stationary points on the benzene dimer PES, 
following the nomenclature of Podeszwa et (Mn = minima; Sn = saddle points). Left: Ghange in inter-monomer distance, 

R, relative to the PBE/CGSD(T) reference for geometries optimized with PBE+vdW methods: MBD (shown in blue), TS 
(shown in yellow) and D3 (shown in green). PBE+MBD consistently predicts the correct inter-monomer distance. For the 
stacked configurations (Ml, S4, S7, and S8) PBE+TS shortens the inter-monomer distance, while for T-shaped configurations 
(M2, SI, S2, S3, S5, and S6) the inter-monomer distance is elongated. For all configurations except the stacked S7 and S8 
structures PBE+D3 predicts too longan inter-monomer distance. Right: Root-mean-square-deviations (RMSD) in A between 
the PBE+vdW and PBE/CCSD(T)p® optimized geometries of these 10 benzene dimer configurations. The RMSD between 
the PBE+MBD and reference PBE/GCSD(T) geometries (shown in blue) are uniformly small and consistent across all minima 
and saddle points on the benzene dimer PES. For several Mn and Sn configurations, the PBE+D3 optimized geometries (shown 
in green) agree quite well with the PBE/GGSD(T) reference, while the PBE+TS optimized geometries (shown in yellow) have 
more significant deviations. 


The root-mean-square-deviations (RMSD in A) between 
the PBE+MBD, PBE+TS, and PBE+D3 optimized ge¬ 
ometries with respect to the reference PBE/CCSD(T) 
results are depicted in Fig. [l] 

From this figure, it is clear that the PBE+MBD 
method, with a mean RMSD value of 0.01 A (and a van¬ 
ishingly small standard deviation of 3 x 10“^ A) with 
respect to the reference PBE/CCSD(T) results, was able 
to provide uniformly accurate predictions for the geome¬ 
tries of all of the benzene dimer configurations consid¬ 
ered. These findings are encouraging and consistent with 
the fact that the PBE+MBD method yields significantly 
improved binding energies for the benzene dimer as well 
as a more accurate quantitative description of the frac¬ 
tional anisotropy in the static dipole polarizability of the 
benzene monomer .1^ This is also consistent with the find¬ 
ing of von Lilienfeld and Tkatchenko that the three-body 
ATM term contributes ~ 25% of the binding energy of 
the ben zene dimer in the parallel displaced configura- 
tion.inii 

With a mean RMSD value of 0.03 ± 0.01 A and 
0.05 ± 0.02 A respectively, the PBE+D3 and PBE+TS 
methods both yielded a less quantitative measure of the 
benzene dimer geometries with respect to the reference 
PBE/CCSD(T) data. Of the 7 benzene dimer configura¬ 
tions for which the PBE+TS RMSD values were greater 
than 0.05 A (namely M2, SI, S3, S4, S6, S7, and S8), it is 


difficult to identify a shared intermolecular binding motif 
among them. Interestingly, PBE+D3 seems to fare bet¬ 
ter on sandwiched geometries and it is only the T-shaped 
S4 and S6 which have RMSDs above 0.05 A. 

However, analysis of the inter-monomer distance (see 
Fig. 0 reveals that PBE+TS tends to shorten the inter¬ 
monomer distance R for sandwich geometries (Ml, S4, 
S7, and S8) by an average of 0.03 A relative to the 
PBE/CCSD(T) results, while it elongates the inter¬ 
monomer distance by an average of 0.09 A for T- 
shaped structures. The dispersive interaction between 
the stacked structures (S7 and S8) is stronger than that 
of the parallel displaced structures (Ml and S4), so 
PBE+TS shortens the inter-monomer distance more sig¬ 
nificantly for S7 and S8. Likewise, these are the only 
two structures for which PBE+D3 shortens the inter¬ 
monomer distance. For all other geometries PBE+D3 
elongates the inter-monomer distance by an average of 
0.06 A. For both sandwich and T-shaped structures, 
PBE+MBD performs much more consistently, elongat¬ 
ing the inter-monomer distance by a scant 5 x 10“^ A 
and 1 X 10“^ A for sandwich and T-shaped configura¬ 
tions, respectively. 

We note that RMSD values in the range of 0.03-0.08 A, 
and errors on the inter-monomer distances of 0.05-0.15 A, 
in the geometries of small molecular dimers (as found 
here with the PBE+TS and PBE+D3 methods) are not 
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unacceptably large in magnitude; however, these differ¬ 
ences will become even more pronounced as the sizes and 
polarizabilities of the monomers continue to increase. In 
this regime, the MBD method—by accounting for both 
anisotropy and non-additivity in the polarizabilities as 
well as beyond-pairwise many-body contributions to the 
long-range correlation energy—is expected to yield accu¬ 
rate and consistent equilibrium geometries for such sys¬ 
tems. As such, the combination of DFT+MBD has the 
potential to emerge as a computationally efficient and ac¬ 
curate electronic structure theory methodology for per¬ 
forming scans of high-dimensional PESs for molecular 
systems whose overall stability is primarily dictated by 
long-range intermolecular interactions. 


B. Intramolecular interactions: secondary 
structure of polypeptides. 


As a second application, we considered the intramolec¬ 
ular interactions that are responsible for the secondary 
structure in small polypeptide conformations. In par¬ 
ticular, we studied 76 conformers of 5 isolated polypep¬ 
tide sequences (GFA, FGG, GGF, WG, and WGG), 
which are comprised of the following four amino acids: 
glycine (G), alanine (A), phenylalanine (F), and trypto¬ 
phan (W). This set of peptide building blocks includes 
the simplest amino acids, glycine and alanine (with hy¬ 
drogen and methyl side chains, respectively), as well as 
the larger aromatic amino acids, phenylalanine and tryp¬ 
tophan (with benzyl and indole side chains, respectively). 
Although each of these polypeptides are relatively small 
(with 34-41 atoms each), a significant amount of confor¬ 
mational flexibility is present due to the non-trivial in¬ 
tramolecular binding motifs found in these systems, such 
as non-bonded side chain-backbone interactions and in¬ 
tramolecular hydrogen bonding. In fact, it is the pres¬ 
ence of these interactions that leads to the formation of 
a-helices and /3-pleated sheets—the main signatures of 
secondary structure in large polypeptides and proteins. 

Following a benchmark study by Valdes et al.^^ in 
which the geometries of these 76 conformers were op- 
timize d u sing second-order Mpller-Plesset perturbation 
theorjlii^ ( MP2) wi thin the resolution-of-the-identity ap- 
proximatioiiii^Wi^ (RI-MP2) and the fairly high-quality 
cc-pVTZ atomic orbital basis setwe performed ge¬ 
ometry optimizations on this set of conformers with sev¬ 
eral vdW-inclusive DFT approaches, namely, PBE+D3, 
PBE+TS, and PBE+MBD. All of the geometry opti¬ 
mizations performed in this section minimized the force 
components on all atomic degrees of freedom according 
to the thres holds and convergence criteria specified in the 
ESP^ Sec. VIIJ Treating the MP2 geometries as our 
reference. Fig. [2 displays box-and-whisker plots of the 
distributions of root-mean-square deviations (in A) ob¬ 
tained from geometry optimizations employing the afore¬ 
mentioned vdW-inclusive DFT methodologies. 

Here we find that the PBE+MBD method again yields 


equilibrium geometries that are consistently in signifi¬ 
cantly closer agreement with the reference MP2 data 
than both the PBE+TS and PBE+D3 methodologies. 
For instance, the RMSDs between the PBE+MBD and 
MP2 conformers are smaller than 0.12 A for all but one 
GGF conformer (34: GGF04), with an overall mean 
RMSD value of 0.07 ± 0.03 A. In contrast to the in¬ 
termolecular case of the benzene dimer, the PBE+TS 
method performs significantly better than PBE+D3 on 
the same benchmark set of polypeptides, with overall 
mean RMSD values of 0.11 ± 0.07 A and 0.20 ± 0.17 
A, respectively. In this regard, the whiskers in Fig. 
extend to RMSD values that are within 1.5 times the 
interquartile range {i.e., following the original, although 
arbitrar y, co nvention for determining outliers suggested 
by TukejE^ , which highlights the fact that there are sev¬ 
eral conformers for which both PBE+TS and PBE+D3 
predict equilibrium geometries that are significantly dif¬ 
ferent than MP2. 

Although MP2 is the most economical wavefunction- 
based electronic structure method that can describe dis¬ 
persion interactions, MP2 tends to grossly overestimate 
Cq dispersion coefficients and hence the binding ener¬ 
gies o f dis persion-bound complexes such as the benzene 
dimer.ll^ Since PBE+MBD should bind less strongly 
than MP2, we expect the side-chain to backbone dis¬ 
tance to elongate slightly for bent conformers. Gonform- 
ers where the side chain is extended away from the back¬ 
bone are expected to show less deviation between MP2 
and PBE+MBD as the side-chain to backbone dispersion 
interaction will be less significant in determining the ge¬ 
ometry of the conformer. 

Aside from the noticeable outliers, the structural de¬ 
viations in most of the conformers correspond to small 
rotations or deflection of terminal groups and side chains 
due to dispersion-based interactions, in contrast to the 
backbone which is constrained by non-rotatable bonds. 
In Fig. I^we present representative overlays of this rear¬ 
rangement, showing the MP2 (blue), PBE+MBD (red), 
and PBE+D3 (yellow) geometries. In a) structure 17 
(GFA03) is a conformer for which both PBE+MBD and 
PBE+D3 give small/moderate RMSDs with MP2. Both 
PBE+MBD and PBE+D3 open the cleft between the 
alanine and phenylalanine, also causing the amine on the 
backbone to slightly rotate. The relative positioning of 
these structures is expected, given the tendency of MP2 
to over-bind dispersion interactions and the tendency of 
PBE+D3 to under-bind. In b) structure 48 (WG03), 
again shows PBE+MBD agreeing well with MP2, but 
slightly opening the backbone-side chain distance. How¬ 
ever, PBE+D3 is disastrous for this structure, yielding 
an RMSD of 1.10 A due to large rotations in both the 
backbone and indole side-chain. 

Structures where the side-chain lies farther off to the 
side of the backbone, such as 4 (FGG215) shown in panel 
d), show the smallest RMSDs between the PBE+MBD 
and reference MP2 geometries with the PBE+MBD ge¬ 
ometry lying almost exactly on top of the MP2 geom- 
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FIG. 2. Box-and-whisker plots showing the distribu¬ 
tion of root-mean-square-deviations (RMSDs) in A between 
76 conformers of 5 isolated small peptides optimized with 
PBE+MBD (blue), PBE-fTS (yellow) and PBE+D3 (green) 
compared against the MP2 reference geometries of Ref. 11161 
Whis kers extend to data within 1.5 times the interquartile 
range.l^^ Note the need for a broken axis to show the largest 
RMSDs of PBE+D3. PBE+MBD consistently outperforms 
both PBE+TS and PBE+D3 in terms of yielding optimized 
geometries closer to the MP2 reference. Median (maximum) 
values are: 0.06 (0.28) A for PBE+MBD, 0.09 (0.52) A for 
PBE+TS, and 0.14 (1.10) A for PBE+D3. 


etry. However, FGG215 is again a structure where D3 
does poorly with respect to the MP2 geometry, this time 
rotating the benzyl side-chain away from the terminal 
glycine, yielding an RMSD of 0.64 A. 

The structure for which the PBE+MBD method has 
the largest RMSD, at 0.28 A, is 34 (GGF04), shown in 
panel c). As opposed to opening a cleft like in GFA03, 
PBE+MBD rotates the phenylalanine and alanine groups 
together. This rotation occurs because the terminal hy¬ 
drogen on the glycine is attracted to the 7r-system on the 
phenylalanine. The rigid nature of the glycine combined 
with the rotatable bond in the phenylalanine, forces the 
phenylalanine to slightly rotate in response. The mo¬ 
tion of the middle glycine solely attempts to minimize 
molecular strain from these other two interactions. Both 
PBE+TS and PBE+D3 methods show a similar rotation 
for this structure, though PBE+D3 rotates the structure 
even farther than PBE+MBD. This concerted rotation 
is associated with a very flat potential energy surface, as 
indicated by the fact that a second optimization run with 
the same tolerances resulted in a slightly greater rotation. 

Following Valdes et al., we classified the structures 
by the existence of an intramolecular hydrogen-bond 
between the —OH of the terminal carboxyl group 
and the G=0 group of the preceding residue. The 
mean RMSD is strongly influenced by the high out¬ 
liers, so the median RMSD is a more representative 
measure for comparing these two groups of conformers. 
The median RMSD for G 02 Hfree (G 02 Hbonded) struc¬ 


tures is: 0.06 (0.07) A for PBE+MBD, 0.09 (0.09) A for 
PBE+TS, and 0.14 (0.14) A for PBE+D3. Overall, we 
And that the presence of this intramolecular hydrogen 
bond does not strongly correlate with which structures 
deviate more from the MP2 geometries. This finding was 
somewhat unexpected since Valdes et al. asserted that 
dispersion interactions are more important in determin¬ 
ing the structure of the GO 2 Hf 1 .ee family of conformers 
due to tendency of the peptide backbone to lie over the 
aromatic side chain. 

Overall, we And excellent agreement between the MP2 
and PBE+MBD geometries. Where PBE+MBD devi¬ 
ates, we And agreement with physical and chemical in¬ 
tuition when we take into account the well known over¬ 
binding for dispersion interactions present in MP2. The 
agreement between PBE+MBD and MP2 geometries is 
in marked contrast to the inconsistent performance of 
PBE+D3 and PBE+TS, which both yielded numerous 
outliers. Although computational cost is not directly 
comparable between a Gaussian-type-orbital code and a 
planewave code, we are greatly encouraged by the ac¬ 
curacy of our PBE+MBD geometry optimizations since 
such calculations with a generalized gradient approxima¬ 
tion functional like PBE are substantially cheaper than 
with RI-MP2. 


C. Supramolecular interactions: the buckyball 
catcher host—guest complex. 

Noncovalent interactions are particularly important 
in supramolecular chemistry, where non-bonded inter¬ 
actions, including dispersion, stabilize molecular assem¬ 
blies. The large size of supramolecular host-guest com¬ 
plexes typically places them outside the reach of high- 
level quantum chemical methodologies and necessitates 
the use of DFT for geometry optimizations and energy 
computations. However, the large polarizable surfaces 
that interact in these systems requires a many-body 
treatment of dispersion to achieve a chemically accu- 
rate description of supramolecular binding energies 
The Ggg “buckyball catcher” host-guest complex (also 
referred to as Ggg@GgQH 2 g) in particular has received 
considerable attention as a benchmark supramolecular 
system in the hope that it is prototypical of dispersion- 
driven supramolecular systems, an d it ha s been stud¬ 
ied e xtensively both experimentall>U^^Jtl 28 | theoret- 
icQXly\S§nMn2§n2MIEM buckyball catcher (de¬ 

noted as 4a by Grimme) is one of the most well stud¬ 
ied members of the S12L t est set of noncovalently bound 
supramolecular complexes . 1 ^^ 

Much of the past computational work has focused on 
modeling the interaction energy of the Ggg buckyball 
catcher and comparing these results to the experimental 
data on thermodynamic association consta nts that have 
been extracted from titration experiments li2SHIIZ! This 
complex is a challenging system for most dispersion cor¬ 
rection methods since the three-body term contributes 
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FIG. 3. Overlays of the structures obtained from geometry optimization with MP2 (blue), PBE+MBD (red), and PBE+D3 
(yellow). In both a) GEA03 and b) WG03, the MBD correction opens the cleft between the backbone and aromatic side-chain 
as MP2 tends to over-bind dispersion interactions, c) In GGE04, PBE-t-MBD rotates the phenylalanine and alanine groups 
together, d) In FGG215, since the side-chain is farther away from the backbone, PBE-I-MBD matches the MP2 geometry 
almost exactly. 


TABLE I. Selected distances of DFT gas-phase optimized 
geometries of the GgQ@GgQH 2 g host-guest complex and con- 
former a of the host alone com pared to X-ray crystal struc¬ 
tures o f Cap @GKnHoa-2PhMj^^and the unsolvated buckyball 
catcher.ll^ The TPSS functional does not identify conformer 
a, so these entries are left blank 




Gomplex 


Host 

a 

Method 

R. (A) 

Rp (A) 

Rt (A) 

Rp (A) Rt (A) 

PBE+MBD 

8.312 

12.992 

6.303 

13.263 

6.394 

PBE+TS 

8.361 

12.974 

6.337 

12.969 

6.080 

PBE+D3 

8.454 

12.987 

6.286 

11.640 

6.215 

TPSS+D3 

8.392 

12.748 

6.288 

- 

- 

TPSS+D3“ 

8.361 

12.822 

6.303 

- 

- 

B97-D'’ 

8.335 

12.798 

6.299 

11.152 

6.216 

M06-2L= 

8.136 

12.703 

6.382 

11.844 

6.322 

X-ray‘^’= 

8.484(3) 

12.811(4) 6.418(5) 9.055(2) 6.44(3) 

“ Ref. [T32 

“ Ref. imi = Ref. [TMl “ Ref. [HSl = Ref. [T2S] 


approximately 10% of the interaction energy.l^fi^i^ Moti¬ 
vated by this large contribution of beyond-pairwise dis¬ 
persion, we optimized the Cgg@CgQH 2 g complex with 
PBE+MBD, PBE+TS and PBE+D3 to see how signif¬ 
icantly many-body effects impact the geometry. Con¬ 
taining 148 atoms, this system also represents a struc¬ 
ture that would be too large to optimize with numerical 
MBD gradients or high-level wavefunction based method¬ 
ologies. All theoretical calculations reported herein are 
for an isolated, i.e. gas-phase, host-guest complex at the 
classical equilibrium geometry at zero temperature, while 
the experimental values listed in Table |T] correspond to 
X-ray determined crystal structures measured at finite 
temperature. Since the base of the buckyball catcher 
host is quite flexible,fi^ we expect the packing environ¬ 
ment in the solid state to potentially impact the reported 
conformation. 

The buckyball catcher host is made of a tetrabenzo- 
cyclooctatetraene (TBCOT) tether and two corannulene 
pincers (cf. Fig. 1^ in the ESp^ and Fig. [^herein). The 
conformation of the catcher is determined by a compe¬ 
tition between the attractive dispersion interactions be¬ 
tween the corannulene pincers and t he s train induced by 
deformation of the TBCOT tether.^i^ The two lowest 


energy “open” conformers of the catcher have the coran¬ 
nulene bowls in a convex-convex “catching” motif or in a 
convex-concave ‘Svaterwheel” motif; following the nota¬ 
tion of Refs. II25I126I130L we term the “catching” motif a 
and the “waterwheel” motif b. 

To compare the size of the cleft between the coran¬ 
nulene pincers when the buckyball catcher is optimized 
with various DFT+vdW methods, we report the distance 
between the most separated carbon atoms of the cen¬ 
tral five-membered rings of both corannulene subunits as 
a measure of the size of the cleft; we denote this dis¬ 
tance as Rp (cf. Fig. 1^. Closing of the cleft tends to 
be accompanied by outward deflection of the TBCOT 
tether, so we also measure the distance between termi¬ 
nal carbons on the tether; we denote this distance as 
Rt (cf. Fig.|4|. Likewise, we measure the distance be¬ 
tween the centroid of the Cgg and the plane that bisects 
the TBCOT tether at the base of the buckyball catcher 
(cf. Fig.|§; we denote this distance as Re- Interestingly, 
several of the functionals that have been used to study 
the buckyball catcher do not identify all four conform¬ 
ers. Notably, TPSS-D3 is prone to drive conformer a to 
a closed variant that has Rp = 5.53 A. With regard to 
the balance between dispersion and strain, conformer a 
results when the Cgg is removed from the pincers and the 
host is allowed to relax. We will focus our discussion on 
the relaxed conformer a and the optimized complex, but 
we also OTOvide optimized structures of conformer b in 
the FSI.Eni 

Upon optimization with PBE+MBD we And that 
the corannulene pincers deflect outward, as seen 
by the increased Rp distance relative to the start¬ 
ing TPSS+D3/def2TZVP geometry from the SI2L 
dataset .1^^ The Rp distance predicted by PBE+MBD 
is larger than other results from vdW-inclusive func¬ 
tionals (see Table |^, which may be consistent with 
previous reports of three-body and higher order terms 
substantially decreasing the b inding energy of the 
CgQ@CgQH 2 g host-guest complex.lSSllsll However, this de¬ 
flection is accompanied by a reduction of the buckyball- 
catcher distance Rc, which would suggest a tighter 
binding. Just as with the reduced cleft distances 
in the peptides and the inter-monomer distance in 
the benzene dimer, we And that the host-guest dis- 
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FIG. 4. Overlay between the geometry of the C 5 q@C 5 oH 28 
host-guest complex optimized with PBE+D3 (red) and 
PBE+MBD (blue). The distance, Rc, between the CgQ cen¬ 
troid and the plane bisecting the tetrabenzocyclooctatetraene 
(TBCOT) tether (transparent green) is reduced from 8.45 A 
with PBE+D3 to 8.31 A with PBE+MBD. The green ar¬ 
row shows that the Rt distance is measured between termi¬ 
nal carbon atoms on the TBCOT tether. The yellow arrow 
shows that the Rp distance is measured between the most 
separated carbon atoms of the central five-membered rings of 
both corannulene subunits. Inset: The 2D molecular struc¬ 
ture of the CgQH 2 g buckyball catcher host, with corannulene 
subunits shown in blue and the TBCOT tether shown in red. 
Atoms used to define the Rt and Rp distances are marked in 
green and yellow respectively. The black dot shows the cen¬ 
troid of the four atoms on the TBCOT tether used to define 
the Rc distance. 


tance predicted by PBE+MBD (i?c = 8.31 A) is smaller 
than that predicted by PBE+D3 {Rc = 8.45 A) and 
PBE+TS {Rc = 8.36 A). For comparison, we also opti¬ 
mized the complex with TPSS+D3/def2TZVP and found 
a buckyball-catcher distance of Rc = 8.39 A, which is 
slightly larger than the Rc = 8.36 A in the previously 
reported TPSS+D3/def2TZVP geometry in the S12L 
dataset These results are reported in Table ||] together 
with a comparison to previous vdW-inclusive DFT re¬ 
sults and the corresponding distances from the X-ray de¬ 
termined crystal structures. 

The X-ray structure for the complex is taken from 
CgQ@CgQH 2 g co-crystallized with t wo d isordered toluene 
molecules, i.e. CgQ@CggH 28 - 2 PhMe.ti^In the solid state, 
the fullerenes form columns along the a-axis, while the 
buckyball catcher aligns back-to-back in the &c-plane. 
These back-to-back interactions have fewer atoms that 
are in van der Waals contact, but could still push the 
corannulene units together slightly. Zabula et al. re¬ 
cently obtained an X-ray crystal structure of the un¬ 
solvated buckyball catcher which adop ts an inter-locked 
structure similar to conformer a .14^ This inter-locked 
structure provides an attractive vdW interaction be¬ 
tween corannulene units, which causes the cleft to close 
{Rp = 9.055(2) A), with a corresponding outward deflec¬ 


tion of the TBCOT tether {Rt = 6.44(3) A). 

Perhaps the most unusual trend in Table is the sub¬ 
stantial opening of the cleft between the corannulene sub¬ 
units, and the accompanying outward deflection of the 
TBCOT tether, when the isolated host is optimized with 
the PBE+MBD method. Comparing the Rp and Rt dis¬ 
tances, we And an ordering of PBE+MBD > PBE+TS 
> PBE+D3. Miick-Lichtenfeld et al. previously found 
that the TBCOT tether is quite flexible, resulting in a 
shallow bending potential (see Fig. 2 of Ref. I126p as 
the Rp distance is varied; using the B97-D functional 
and 6-31G* basis set, the energy of conformer b varies 
b y on ly ~ 1.3 kcal/mol as Rp is scanned from 10-14 
Aim] Comparing the energy of the buckyball catcher in 
the strained conformer that it adopts when hosting the 
buckyball, to its energy when fully relaxed, we see that 
at the PBE+D3/def2TZVP level this strain energy is 
1.02 kcal/mol. This is consistent with the shallow bend¬ 
ing potential found by Miick-Lichtenfeld et al. Given 
how flat this PES is, it is less surprising that the three 
vdW corrections considered give such different relaxed 
Rp distances for the isolated host. 

The structure of the Cgg buckyball does not vary signif¬ 
icantly between different vdW-inclusive functionals. The 
PBE+MBD optimized structure of Cgg has C-C bond 
lengths of 1.45192(5) A for bonds within flve-membered 
rings (fusing pentagons and hexagons), and 1.39804(3) A 
for bonds fusing hexagonal rings; which compares favor¬ 
ably to the well known gas-phase elec tron diffraction re¬ 
sults of 1.458(6) A and 1.401(10) This result is 

consistent with the short-range behavior of the range- 
separated PBE+MBD method, which essentially reduces 
to the bare PBE functional and does a good job of pre¬ 
dicting C-C bond lengths. 

On the whole we And that the PBE+MBD method 
yields structures that are comparable to other vdW- 
inclusive functionals but deviates more significantly 
from the X-ray determined crystal structure than the 
PBE+D3 results. Since we do not have an experimentally 
determined gas-phase structure or a wavefunction theory 
reference for the Cgg@CgQH 28 host-guest complex, the 
deviation of the gas-phase PBE+MBD optimization from 
the experimental crystal structure should not be taken as 
a benchmark comparison. Future work will address the 
optimization of this full crystal structure. 

In light of the lack of high-level wavefunction-based ge¬ 
ometries to compare against, we conclude with a few com¬ 
ments about the computational efficiency of our method. 
Starting from the TPSS/def2TZVP structures from the 
S12L dataset, we were able to optimize the 148-atom 
complex with the PBE+MBD method in 68 BEGS steps 
in about 415 cpu hours, while the PBE+D3 optimizat ion 
in Orca took 34 BEGS steps in about 450 cpu hours.^^^ 
Given that Orca uses redundant internal coordinates for 
geometry optimizations and the D3 correction is almost 
instantaneous to calculate, it is worth noting that the 
Cartesian coordinates optimization in QE with the much 
more costly MBD correction is roughly competitive. 
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D. The importance of dV. 

Our derivation of the nuclear MBD forces placed con¬ 
siderable emphasis on the importance of including the 
implicit coordinate dependence arising from the gradi¬ 
ents of the Hirshfeld effective atomic volumes. To test 
how large of a contribution that the dV terms make to 
the MBD forces, we re-optimized the benzene dimers, this 
time setting dV — 0 explicitly. As shown in Fig.j^in the 
ESI Eol neglect of the Hirshfeld volume gradients does not 
have a large impact for this system, in which the disper¬ 
sion forces are intermolecular; the mean RMSD becomes 
(16 ± 5) X 10“^ A. This result is expected for this sys¬ 
tem because the Hirshfeld effective atomic volumes only 
change when nearest neighbor atoms are moved. Not 
only is the benzene monomer fairly rigid, but the range 
separation employed in MBD means that the long-range 
tensor Tlr, and correspondingly the MBD correction, is 
largely turned off within the benzene monomer (see Fig.[^ 
in the ESPl). 

We expect a larger impact from Hirshfeld volume gra¬ 
dients for systems that are flexible and large enough for 
the damping function to have “turned on” the MBD cor¬ 
rection. The case of polypeptide intramolecular disper¬ 
sion interactions matches both of these criteria. We com¬ 
puted the MBD forces on the final optimized geometries 
of all 76 peptide structures and analyzed the atom by 
atom difference in the forces computed with and without 
the Hirshfeld volume gradients. As shown in Figure 
neglect of the Hirshfeld gradient causes a significant shift 
in the distribution of the MBD forces in the peptides, 
with a tendency to increase the forces from the lower 
peak from ^ 2 x 10“^ Eh/a.u. to ~ 4 x 10“^ Eh/a.u.. 
Comparing the Cartesian components of the MBD forces 
across all atoms in all 76 structures we find that the de¬ 
viations between MBD forces with and without the Hir¬ 
shfeld volume gradients (F — ^-re approximately 

normally distributed with zero mean and a standard de¬ 
viation of 2 X 10“^ Eh/a.u. (see Fig. §intheESP^. This 
leads to the norm of the force difference (A||F — Fav||) 
having a mean of (3.2 ± 1.7) x 10“^ Eh/a.u., and a 
mean of the difference of norms of ||F|| — ||Fav=o|| = 
(—5±17) X 10“^ Eh/a.u. Overall, neglect of the Hirshfeld 
gradients increases forces and causes a long-tailed distri¬ 
bution of relative error, that is peaked at ^ 20%, but 
extends up to 400%. This large distribution of relative 
errors has the potential to significantly impact the deter¬ 
ministic nature of ab initio molecular dynamics (AIMD) 
simulations run at the MBD level of theory that do not 
properly account for the analytical gradients of the Hirsh¬ 
feld effective volumes. Given that this error would accu¬ 
mulate at every time step, combined with the fact that 
the MBD correction was found to be quite important 
in the geometry optimizations of the systems considered 
herein, we find the neglect of the Hirshfeld effective vol¬ 
ume gradients to be an unacceptable approximation in 
AIMD. This finding is particularly true for large flexible 
molecular systems with significant intramolecular disper¬ 


sion interactions since this error can cooperatively in¬ 
crease along any extended direction, i.e., along an alkane 
chain or polypeptide backbone. 

V. CONCLUSIONS AND FUTURE RESEARCH 

By developing analytical energy gradients of the range- 
separated MBD energy with respect to nuclear coordi¬ 
nates, we have enabled the first applications of MBD to 
full nuclear relaxations. By treating the gradients of the 
MBD energy correction analytically, rather than numer¬ 
ically, we have reduced the number of self-consistent cal¬ 
culations that must be performed from 2 x {3N — 6) to 1, 
enabling treatment of much larger systems. Our deriva¬ 
tion and implementation includes all implicit coordinate 
dependencies arising from the Hirshfeld charge density 
partitioning. In the isolated molecule optimizations that 
we considered herein, the implicit coordinate dependen¬ 
cies that arise from the Hirshfeld volume gradients re¬ 
sulted in significant changes to the MBD forces. The 
long-tailed distribution of relative error that we observed 
indicates that any future AIMD simulations employing 
MBD forces must include full treatment of the Hirsh¬ 
feld volume gradients, or the accumulation of error will 
negatively impact the simulation dynamics. Our care¬ 
ful treatment of these volume gradients paves the wave 
for future work to address how a self-consistent imple¬ 
mentation of the MBD model will impact the electronic 
band structures of layered materials and intermolecular 
charge transfer couplings in molecular crystals. A fully 
self-consistent treatment of MBD will likely be required 
for energy conservation in AIMD simulations. 

Consistent with previous findings that a many-body 
description of dispersion improves the binding energies 
of even small molecular dimers,^ we find that MBD 
forces significantly improve the structures of isolated dis- 
persively bound molecular systems displaying both inter¬ 
molecular and intramolecular interactions. We find ex¬ 
cellent agreement between PBE+MBD optimized struc¬ 
tures and reference PBE/CCSD(T) and MP2 geometries. 
Notably, PBE+MBD consistently outperformed the pair¬ 
wise PBE+D3(BJ), and effectively pairwise PBE+TS op¬ 
timizations. 

The first applications of MBD forces in this paper 
were restricted to gas-phase systems because computa¬ 
tion of MBD gradients in the condensed phase, where 
periodic images of the unit cell must be considered, is 
substantially more challenging from a computational per¬ 
spective. Converging the MBD energy in the condensed 
phase is demanding (from both the memory and com¬ 
putational point of view) due to a real-space supercell 
procedure that is required to support long-wavelength 
normal modes of A forthcoming publication will 

describe the details of our implementation of MBD forces 
for periodic systems, including careful treatment of par¬ 
allelization and convergence criteria.!^ 
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FIG. 5. Left: Gaussian kernel density estimate of the distributions of the norm || • || of MBD forces Fmbd acting on each 
atom at the optimized geometries of 76 tripeptide structures. In blue, the MBD forces were computed with full Hirshfeld 
gradients (||F||); in yellow, the forces were computed with the Hirshfeld gradients dV set to zero (||Fav=o||). Right: Gaussian 
kernel density estimate of the distribution of relative percentage error |j AF||/||F|| where AF = F — Fay^o is the error incurred 
by setting the Hirshfeld gradients to zero. The distribution is peaked at approximately 20% but extends to values much greater 
than 100%. 


Since MBD forces are very efficient to evaluate for gas- 
phase molecules, we are eager to explore the applica¬ 
tion of MBD to AIMD simulations. Many-body effects 
have previously been shown to be significant in modeling 
solvation and aggregation in solutiorPi^ and can lead to 
soft colle ctiv e fluctuations that impact hydrophobic as¬ 
sociation,and the entromc stabilization of hydrogen- 
bonded molecular crystals.lS^ We therefore anticipate that 
our many-body forces will be of interest for solvated sim¬ 
ulations, such as e stim ates of the thermodynamic proper¬ 
ties of metabolite^i^ and modeling novel electrolytes.^^^ 
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VII. ELECTRONIC SUPPLEMENTARY INFORMATION 
A. Symbol glossary 


Symbol Description 


afr-(O) 

aa(0) 

aa(iw) 

aa(iw) 

A(iia;) 

A(iia;) 

aa(0) 

aa(iw) 

, .free 

C free 
6,aa 

Ce^aa 

Ce^aa 

Wa 

Wa 

Vp 

9p 

CTa(iw) 

Aa(iw) 

■yfree 
^ a 

Va 

Sab(iw) 


static free-atom polarizability formed with scalars on the diagonal 

Eq. (5l: static bare polarizability, related to q:^®®( 0) by the ratio Ea/U^''®® 

Eq. (8l: frequency-dependent bare polarizability tensor, calculated by Fade approximant Eq. Q 
‘isotropized’ bare dipole polarizability scalar, calculated as = |Tr[Q;a] 


Eq. (10) 


Eq. (12) 


Eq. (M) 
Eq. (13) 


bare system polarizability tensor, 3iV x 3N block diagonal matrix of Q:a(iw) 
screened system polarizability tensor, solved at complex frequency iw using [A“ 
screened static polarizability calculated by partial contraction of A(0) 
screened frequency-dependent polarizability. 


■ Tsi^] 


-1 


calculated by partial contraction of A(iia;) 
free-atom QHO excitation frequency, computed as w^™® = 4/3 ^Cg’’®® / [Q!a®®(0)] 
free-atom Cq coefficient (also called Hamaker constant) 

bare effective atomic Ce coefficient computed by weighting C|"'®® with (Va/V'/''®®)^ 
Eq. (15): screened effective atomic Cq coefficient 

bare QHO excitation frequency, equals w^"'®® due to cancellation of Vq/U/'^®® factors 
Eq. (16): screened QHO excitation frequency 


frequency grid for numerical integration 
weights for numerical integration 

Eq. (25): QHO width, calculated from the bare polarizability scalar 
Eq. (26): multiplicative prefactor to Va in defining cca 


free-atom effective volume 


Eq. (63): Hirshfeld effective atomic volume 

Eq. (24): effective correlation length of the interaction potential. 


defined from QHO widths of atoms a and b 
5la nuclear position of atom a 

Rab internuclear distance between atoms a and b 

Rab internuclear vector (IRq — IRt,) between atoms a and b 

Rab Cartesian component of the internuclear vector 

r spatial position such as the argument of the electron density 

C ratio between interatomic separation R and correlation length 

/i(C) Eq. (59): function of C appearing in T 
v{R,iuj) Eq. (22): Coulomb interaction between two QHO Gaussian 


charge densities separated by R with frequency-dependent interaction 


T 

Tdip 

Tsi^ 

Tlr 

f{Z) 

Sab 

Zab 

7 ?vdW,TS 

a 

—vdW 
a 

P 

pf 

Psad 

qMBB 

AT 

A 

dc 

A'MBD 

pMBD 


Eq. (58) 
Eq. (28) 


Eq. (36) 


Eq. (38) 
Eq. (29) 


dipole interaction tensor between QHO Gaussian charge densities 

dipole interaction tensor between point dipoles 

short-range component of T, evaluated using 

long-range component of T, evaluated using /(Z^dw) 

damping function for range-separation of the dipole interaction tensor 


sum of effective vdW radii scaled by jS 
ratio of interatomic separation Rab and Sab 


Eq. (35): effective vdW radii at the TS level 


Eq. (37): screened effective vdW radii 


total electronic charge density 


Eq. (62): Hirshfeld effective electron density assigned to atom a 
sum of spherical free-atom densities pg"'®® 


Eq. (18): MBD model Hamiltonian 


Eq. (19): MBD interaction matrix 
matrix of eigenvectors of 
vector of eigenvalues of 

gradient with respect to nuclear position of atom c, equivalent to 
MBD correlation energy 
MBD ionic forces: — V 3 j^Embd 






























20 


B. Length Scale of Damping 



FIG. 6. Left: Contours at 10 ® for damping functions exp [—C^] (purple) and (1 — / [Z]) (orange), with ||R|| relative to the 
atom marked in red. Damping parameters E ~ 1.03 and S ~ 2.96 (cf. Eqs. \2A\ and @) were computed for a graphene 
nanofiake with the PBE functional. Right: Comparison of the three damping functions with the same 10“® contour indicated. 
The rapid decay of exp [—C^] relative to the Fermi damping function demonstrates that the short-range dipole-dipole interaction 
tensor Tsr reduces to the frequency-independent Tdip well before the long-range tensor Tlr has been fully “turned on” by the 
Fermi damping function (cf. Eqs. ( |27| |28[[M}|38[ )). 


C. Computation of dV 

Nuclear coordinate forces within a fully self-consistent 0{N) implementation of the Tk atch enko-Scheffler (TS) 
schem^^were previously developed in Quantum ESPRESSO (QEJ^by R. A. DiStasio Jr.hdSJ A subroutine of the 
TSVDW module computes the Hirshfeld partitioning into effective atomic volumes, 14, and the derivatives of that 
volume, dVa- The Hirshfeld effective charge density of atom a is: 

pf (r) = zcjr) p(r) = p(r), (62) 

where p(r) is the total molecular charge density and Psad(r) = ~ ^bll) ^^e sum of free-atom densities. 

The effective volume is then: 


K = y'dr||r-Xfpf(r). (63) 

Integrations on spherical atomic domains, such as in Eq. |63[ are computed on subsets of the real-space mesh. Using 
reference data for the free atom volumes, the radial grid cutoff value is determined for each species such that the free 
atom volume obtained by numerical integration up to this cutoff does not deviate from the reference value by more 
than 1.0%. The effective volume derivative is evaluated as 


drVa = / dr llr - X 


^dcpf{v)-?,5ca J dr (r-3la)|lr-3la||Pa®(i’) 


9cpf{r) = 


p*®®(||r-31^11) p(r) 


[Psad(r)]' 


P(i') 

Psad(l*) 


r — Jlr 


-Jlr 


apr^(r) 

dr 


(64) 

(65) 


Note that the free-atom density is spherically symmetric, which is why we reduce OcPc^dlr —3ic||) to a spherical coor¬ 
dinate derivative Likewise, Eq. (631 is evaluated by mapping the radial form of p®® to an linear/equispaced 

grid, which is then interpolated using cubic splines. After interpolat ion, the derivative dcp’^ at each grid point is 
evaluated by numerical differentiation using Bickley’s 7-point formula.^ 
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D. Repeated Eigenvalues of C'^ 


In considering the derivative of Ap, in Eq. (421 we assumed that had 3N distinct eigenvalues. Due to numerical 


perturbations it is somewhat unlikely for ]^g^yg repeated eigenvalues, but we cannot assume this a priori. 

The procedure for taking derivatives of repeated eigenvalues of a real, symmetric matrix, like is essentially first 

order perturbation theory where the perturbation is the action of the derivative operator d(.. Eigenvalue degeneracies 
are lifted by diagonalizing the per turb ation in the dege nera te subspace. For a more algorithmic discussion of repeated 
eigenvalue derivatives, see FriswelP^ or Andrew et Since is real and symmetric, it is guaranteed to be 

diagonalizable with orthogonal eigenvectors. 


E. Importance of dV 


1. Benzene dimer 


To analyze the importance of dV, we re-optimized the benzene dimer structures with dV terms set explicitly to 
zero. As shown in Fig. setting dV = 0 slightly degrades the consistency of the PBE+MBD optimized geometries, 
but the final RMSDs are still quite good (all < 0.025 A). The optimization of Ml with dV = 0 proved numerically 
unstable, and was unable to converge, so Ml is not included in the figure. The fact that the Hirshfeld gradients 
have a negligible impact on the benzene dimer optimizations is expected since the Hirshfeld effective atomic volumes 
only change when nearest neighbor atoms are moved. In addition to being quite rigid, the benzene monomer is small 
enough that the range-separated MBD correction is largely turned off within the length scale of the monomer, which 
is where the Hirshfeld gradients could matter. 


Q 

(O 


0.025 -| 
0.020 - 
0.015 - 
0.010 - 
0.005 - 
0.000 - 


□ MBD □MBD(dy=0) 

I I I I I I I i| I I I I I I rr 


M2 SI S2 S3 S4 S5 S6 S7 S8 


FIG. 7. Root-mean-square-deviations (RMSD) in A between the PBE+MBD and PBE/CCSD(Tj^® optimized geometries of 
9 benzene dimer configurations using the full MBD gradient (shown in blue), and the approximation where dV contributions 
are set explicitly to zero (shown in grey). 


2. Polypeptides 


We also performed single-point calculations on the optimized geometries of all 76 tripeptide structures to compare 
the MBD forces computed with and without the dV contributions. The peptide structures are much more flexible 
than the benzene monomer and also have the opportunity for cooperative addition of the Hirshfeld volume gradients 
along the chain, i.e. the local Hirshfeld volume gradients acting at the nearest neighbor level can propagate along 
the peptide chain and result in a larger change. In Fig. we visualize the deviation between the forces computed 
with full Hirshfeld volume gradients and those computed with dV = 0 in several ways: difference of individual force 
components AF^ = — F^ norm of the difference of forces ||F — F9v=oll> relative percentage error || AF||/||F||, 

and distributions of the norms of forces ||F|| vs. IjF^F = 0||. 
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a.) b.) 




AF, (10-3 Eh/a.u.) ||AF|| (IQ-^ E^/a.u.) 


FIG. 8. Gaussian kernel density estimates of the distributions of MBD forces acting on each atom at the optimized geometries of 
76 tripeptide structures. a.)-c.) Difference of the force components AFi = Fi — Fi^gy^o, with a normal distribution Af{0, 0.2) 
and dotted line indicating zero mean superposed for reference, d.) Norm of the difference of forces, ||AF|j = ||F — Fav=o||, 
with the dotted line indicating that the peak occurs at ~ 0.3 x lO-^ Eh/a.u. 


F. Structure of the C6o@C6oH28 Buckyball Catcher Host—Guest Complex 

In Fig. [^the 2D molecular structure of the buckyball catcher host and the 3D structure of the CgQ@CgoH 2 g host- 
guest complex with the three distances Rc, Rp, and Rt are highlighted. For each DFT-vdW optimized structure 
of the host, we report the Rp and Rt distances. All geometry optimizations of the Cgn@CgQH 2 g buckyball catcher 
host-guest complex started from the TPSS+D3/def2-TZVP structures in the S12L set.^^H We optimized the complex, 
guest CgQ, and conformers a and b of the host. Structures of the complex, guest Cgg, and host optimized with other 
functionals and vdW correction schemes can be found in the supplemental information of the following references: 
Ref. [HI TPSS+D3/def2-TZVP, Ref. [Ml B97-D/TZVP, Ref. (Hi M06-2L/MIDI!. 
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FIG. 9. Left: 2D structure of the ‘buckyball catcher’ CgQH 28 . The central tetrabenzocyclooctatetraene (TBCOT) tether (red) 
links the two corannulene bowls (blue). The orange circles mark the four atoms used to define the Rt distance between the 
back ends of the TBCOT tether. The green circles mark the four atoms used to define the plane from which the distance to 
the CgQ centroid, Rc, is measured. The purple circles mark the two atoms used to define the Rp distance (ClOe and ClOe’ in 
the notation of Ref. ITM|l . which are the most separated atoms of the central five-membered rings of both corannulene subunits. 
Right: 3D structure of the C 6 o@C 6 oH 28 complex with the three distances Rc, Rp, and Rt highlighted. 


G. Self-Consistent Screening 

Self-consistent screening (SCS) is accomplished by solving the following non-homogeneous system of linear equations 
at a given complex frequency iw (Eq. (17) in DiStasio et al.^^: 


N 


aa{ioj) = aa{iuj) - ttaiiuj) ^ Tab ctaiiuj). 

b^a 


( 66 ) 


To accomplish a range-separated self-consistent screening (rsSCS), we replace T with Tsr (see Ref. [K!?]| . Eq. (66) can 
then be written as a matrix equation as: 

A = A-ATsrA (67) 

Note that (^pp = 0 so Tpp = 0 naturally (see Eq. @). Thus, the sum J2b^a ^ob cia is accomplished by the product 


Tsr a. Rearranging Eq. ( |67[ ) and then left multiplying by A ^ gives: 

A -f- A Tsr A = A 
A'^ [I + A Tsr] a = A^iA 
[A'l-f Tsr] A = I 


Left multiplying by the inverse of the bracketed quantity yields: 

A= [A^i+Tsr] 


-1 


( 68 ) 

(69) 

(70) 

(71) 


H. Derivation of 5T“- 


To break the derivative of T*-* into smaller pieces, we define some convenience functions: 

U = erf [C] - -^C exp [-C^] 


= 




R5 


4c^exp[-C^ 


-'■dip — 


i?5 


i?3 


(72) 

(73) 

(74) 






























So in terms of these functions, is: 


4 

dT^i = UdT%^ 


+ Tdip^t/ + aw®^' 


The derivative of T^jp is given in Eq. (54l. Note that we can write d {WW in terms of ax^jp as: 

'r^rR 


a 


i?5 


— —-ax®-^ — —8R 

- gC'^dip 


So the derivatives of U and W®-' are: 


du = ^eexp[-e] ac 
aw®^' = 


TT 

R^R^ 


1 


Sii 


[3 - 2C^] exp [-C^] ac + ^c" exp [-C^] ( -^ax^.p - ^ai? 


Now define h{() = exp [-C^] • 

^du = h{C)dC 


aw®J = 


R^R3 


^ 1 [3 - 2C^] hiOdC + CMC) (-3^T:i';p - ^ 


',dR 


In terms of /i(C) we can then write ax®'^ as: 

iMC) 


ax®^ = 


erf [C] - 


2 C 


ax^,p + CMC) -,9t:/,p- 


1 _ Mian 

3 dip 


Ldip ' 


i?®i?J 


[3-2C2] 


(75) 

(76) 


(77) 

(78) 

(79) 

(80) 
(81) 

h(C)aC (82) 


Where the derivative of Cat) is in Eq. (601. 


I. Scaling of Gauss-Legendre Quadrature 


To transform Gauss-Legendre quadrature from the interval Xp € 
we map the abscissa Xp and weights Wp with an algebraic scaling: 


[—1,1], to the semi-infinite interval Pp € [0,oo), 


Pp e [ 0 , 00 ) 


2L 


(83) 

(84) 


There are many different possible transformations to [0, 00 ), b ut t he algebraic mapping is quite robust for quadrature 
of functions f{x) that decay alge braic ally in jxj as x —>■ 00 Since the isolated atom dynamic polarizability is 
expected to decay as Q!(ia;) (x i/d;2 1iill we found the algebraic scaling to be preferable, although other choices such as 

Pp = Ltan j also perform well. Since the number of Gauss-Legendre quadrature points, n, determines the 

number of self-consistent screening computations that must be performed to determine uja, the computational cost 
(and numerical error) of evaluating the MBD correlation energy can be varied by adjusting the number of quadrature 

points. The quadrature error is also sensitive to the scale factor L. _ 

Based on the available atomic dynamic polarizability reference data in Derevianko et al.lSH and the free atom 
reference quantities used in the TS method,!^ our quadrature method should be able to integrate a response function 
for excitation frequencies in the range of uii ^ 0.06 (K) to uii ^ 1.2 (Ne). In optimizing the number of Gauss-Legendre 
points, n, and the scale factor, L, we used the Gasimir-Polder integral for a single excitation frequency dipole oscillator 
as a trial function with oJa varying in the range [10“^, 10^]. 
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(85) 


Using this trial function, we choose n = 20 quadrature points and a scale factor L = ^, which gives integration 
with a relative error less than 10“® for all excitation frequencies in the range [0.07, 5] and also performed well in 
self-consistent screening computations across a range of isolated atomic systems. 
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TABLE II. Convergence tolerances and unit cell sizes used in PWSCF geometry optimizations, reported in Rydberg atomic 
units (1 Ry = | Eh) 


Benzene Dimer Peptides CgQ Catcher 


Escf (Ry) 

10 “® 

10 “® 

10 "® 

Ecut (Ry) 

400 

145 

110 

Etot (Ry) 

10 “® 

5 X 10“^ 

5 X 10"'^ 

Ftot (Ry/ao) 

lO""* 

5 X 10-^ 

5 X lO"'^, 

Cell Size (ao) 

30 

30 

50 

Grid Spacing (A) 

0.04 

0.07 

0.12 


TABLE III. Convergance tolerances used in Orca geometry optimizations, reported in Hartree atomic units 


Benzene Dimer Peptides CgQ Catcher 


Escf (Eh) 

10"® 


10"® 

10"® 

Etot (Eh) 

10"® 

5 

X 10"® 

10"® 

FMax (Eh/ao) 

lO"'^ 

3 

X 10""^ 

10"^ 

Frms (Eh/ao) 

3 X 10"® 


10""^ 

3 X 10"® 

Max Disp. (ao) 

10"® 

4 

X 10"® 

10"® 

RMS Disp. (ao) 

6 X 10"^ 

2 

X 10"® 

6 X 10""* 


J. Additional Computational Details 

All geo metry o ptimizations were performed using the quasi-Newton Broyden-Fletcher-Goldfarb-Shanno (BFGS) 
algorithm, h^SlllsiJ with default parameters. 


1. Quantum ESPRESSO 


Gartesian coordinate geometry optimizations in Quantum ESPRESSO (QE)P^ were performed in the PWsCF 
module in large simple cubic unit cells. Table [TTjgives details of the convergence tolerances, kinetic energy wavefunction 
cutoffs, and unit cell sizes used for each system. Since QE uses Rydberg energy units (1 Ry = ^ Eh), we report the 
tolerances in these units. The PBE functionaP^ was used with Hamann-Schlueter- Ghi ang-Vanderbilt (HSGV) norm- 
conserving pseudopotentialP^ obtained fro m t he FPMD pseudopotential repositorjEH (and converted to UPF format 
using a modified version of QS02upf vl. All QE calculations were run at the F point using a charge density 

cutoff of pcut = 4Ecut- PBE+MBD jobs used 20 quadrature points for the Gasimir-Polder integration. To ensure a 
fair comparison with our implementation of the MBD model, all TS calculation^^ were performed as a posteriori 
corrections to the solution of the non-linear Kohn-Sham equations, i.e. we turned off the self-consistent density 


updates from TS. In Fig. 10 we present the results of convergence testing with respect to the kinetic energy cutoff in 
the planewave basis set expansion, showing that the total energy per atom was converged to better than 0.3 meV/atom 
for each system. 


2. Orca 

Redundant internal coordinate geometry optimizations in Orca vS.OlP^were performed with the PBE functionaP^ 
with the atom-pairwise version of the D3 dispersion correction of Grimme et aZ.,^ using Becke-Johnson (BJ) damp- 
ingim Orca v3.03 implements D3 in the DFTD3 v2.1r6 software, which does not contain analytical gradients of 
the t hree -body term. The geometric counterpoise correc tion (gGP) of Kruse et al. was employed in all Orca c alcula- 
tions.li^ We employed the Ahlrichs def2-TZVP basis setP^ coupled with an auxiliary Ahlrichs TZVP basis 
for the RI-J approximation.^ ^hi9 | i58 | calculations used the g4 final integration grid. All calculations used “tight” 
SGF tolerances; calculations on the benzene dimer and GgQ catcher used “tight” optimization tolerances, while those 
of the peptides used default optimization tolerances. Table [III] lists the tolerances corresponding to these two settings. 













26 



FIG. 10. Convergence of the total energy per atom with respect to kinetic energy wavefunction cutoff (Ecut) of the planewave 
basis expansion for Left: simulations of the Cgo catcher complex, Right: the benzene dimer (Ml configuration). 


K. Cartesian Coordinates of Structnres 


In the accompanying supplementary .txt files we provide the Cartesian coordinates (in A) of all structures considered 
in the text. 


1 . Stationary points of the benzene dimer potential energy surface 

We consider ten configurations of the benzene dimer, which correspond to stationary points of the SAPT(DFT(P^ 
potential energy surface in Ref. 111)41 U sing a fixed monomer geometry from the SDQ-MBPT(4)/cc-pVTZ results of 
Gauss and Stanton,I^Bludsky et al\^^ optimized these 10 configurations of the benzene dimer at the PBE/CCSD(T) 
level of theory, with an aug-cc-pVDZ basis set and counterpoise correction. The benzene dimer geometries may be 
found in the following files: 

benzene_monomer.txt benzene_dimer_ccsd.txt benzene_dimer_mbd.txt 

benzene_dimer_ts.txt benzene_dimer_d3.txt 


2 . Secondary structure of isolated polypeptides 


We considered 76 conformers of the following 5 isolated small peptides, GFA, FGG, GGF, WG, and WGG, con¬ 
taining the residues glycine (G), alanine (A), phenylalanine (F), and tryptophan (W). Our starting geometries were 
taken from www.begdb.co m, c orresp ond ing to the RI-MP2/cc-pVTZ optimized structures given in the supplemental 
information of Valdes et aZ.hdSI Table IV gives the correspondence between the structur e ind exing scheme used in this 
work, the nomenclature of the begdb database and the nomenclature of Valdes et Due to the ease of down¬ 

loading structures from the begdb database, we only present our PBE+MBD and PBE+D3 optimized geometries in 
the accompanying text files. The peptide geometries may be found in the following files: 


GFA_mbd.txt 
GFA_ts.txt 
GFA_d3.txt 


FGG_mbd.txt 
FGG_ts.txt 
FGG_d3.txt 


GGF_mbd.txt 
GGF_ts.txt 
GGF_d3.txt 


WG_mbd.txt 
WG_ts.txt 
WG_d3.txt 


WGG_mbd.txt 
WGG_ts.txt 
WGG_d3.txt 


3 . C6o@C6oH28 buckyball catcher host-guest complex 

The buckyball catcher host-guest geometries may be found in the following files: 

catcher_host.txt 
catcher_monomer.txt 
catcher_complex.txt 
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TABLE IV. Peptide naming conventions in this work, the begdb database, and Ref. 11161 


This work 

BEGDB 

Ref. [TT6l 

This work 

BEGDB 

Ref. ITTHl 

0 

252_ 

FGG55 

FGG_ 

055 

38 

228 

_GGF08 

GGF_08 

1 

263_ 

FGG80 

FGG_ 

080 

39 

230 

_GGF09 

GGF_09 

2 

253_ 

FGG99 

FGG_ 

099 

40 

225 

_GGF10 

GGF_10 

3 

264_ 

FGG114 FGG 

114 

41 

229 

_GGF11 

GGF_11 

4 

257_ 

FGG215 FGG_ 

215 

42 

224 

_GGF12 

GGF_12 

5 

258_ 

FGG224 FGG 

224 

43 

222 

_GGF13 

GGF_13 

6 

255_ 

FGG252 FGG_ 

252 

44 

221 

_GGF14 

GGF_14 

7 

254_ 

FGG300 FGG_ 

300 

45 

226 

_GGF15 

GGF_15 

8 

265_ 

FGG357 FGG_ 

357 

46 

214 

_WGG01 

WGG_01 

9 

256_ 

FGG366 FGG 

366 

47 

211 

WGG02 WGG 02 

10 

259_ 

FGG380 FGG_ 

380 

48 

209 

WGG03 WGG 03 

11 

260_ 

FGG412 FGG_ 

412 

49 

208 

WGG04 WGG 04 

12 

261_ 

FGG444 FGG_ 

444 

50 

210 

WGG05 WGG 05 

13 

262_ 

FGG470 FGG_ 

470 

51 

206 

WGG06 WGG 06 

14 

266_ 

FGG691 FGG_ 

691 

52 

215 

WGG07 WGG 07 

15 

248_ 

GFAOl 

GFA_ 

01 

53 

207 

WGG08 WGG 08 

16 

239_ 

GFA02 

GFA_ 

02 

54 

217 

WGG09 WGG 09 

17 

247_ 

GFA03 

GFA_ 

03 

55 

219 

WGGIO WGG 10 

18 

251_ 

GFA04 

GFA_ 

04 

56 

216 

WGGll WGG 11 

19 

250_ 

GFA05 

GFA_ 

05 

57 

220 

WGG12 WGG 12 

20 

245 _ 

GFA06 

GFA_ 

06 

58 

218 

WGG13 WGG 13 

21 

237_ 

GFA07 

GFA_ 

07 

59 

212 

WGG14 WGG 14 

22 

242 _ 

GFA08 

GFA_ 

08 

60 

213 

WGG15 WGG 15 

23 

241_ 

GFA09 

GFA_ 

09 

61 

195 

_WG01 

WG_01 

24 

238_ 

GFAIO 

GFA_ 

10 

62 

194 

_WG02 

WG_02 

25 

240 _ 

GFAll 

GFA_ 

11 

63 

191 

_WG03 

WG_03 

26 

244_ 

GFA12 

GFA_ 

12 

64 

204 

_WG04 

WG_04 

27 

243 _ 

GFA13 

GFA_ 

13 

65 

205 

_WG05 

WG_05 

28 

249 _ 

GFA14 

GFA_ 

14 

66 

193 

_WG06 

WG_06 

29 

236_ 

GFA15 

GFA_ 

15 

67 

197 

_WG07 

WG_07 

30 

246 _ 

GFA16 

GFA_ 

16 

68 

202 

_WG08 

WG_08 

31 

231_ 

GGFOl 

GGF 

01 

69 

198 

_WG09 

WG_09 

32 

234_ 

GGF02 

GGF_ 

02 

70 

192 

_WG10 

WG_10 

33 

233_ 

GGF03 

GGF_ 

03 

71 

203 

_WG11 

WG_11 

34 

227_ 

GGF04 

GGF 

04 

72 

201 

_WG12 

WG_12 

35 

235_ 

GGF05 

GGF 

05 

73 

200 

_WG13 

WG_13 

36 

232_ 

GGF06 

GGF 

06 

74 

196 

_WG14 

WG_14 

37 

223 

GGF07 

GGF 

07 

75 

199 

WG15 

WG 15 






